Keir Fraser [Wed, 3 Feb 2010 09:16:11 +0000 (09:16 +0000)]
hvmloader: Fix CPU hotplug notify handler in ACPI DSDT.
By merging PRSC and NTFY methods we simplify the code, improve
efficiency, and fix a bug where PRSC iterated 0-255 but NTFY could
only handle CPU numbers 0-127.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 1 Feb 2010 14:03:47 +0000 (14:03 +0000)]
pygrub: support parsing of syslinux configuration files
Allows booting from ISOs which use isolinux as well as guests using
extlinux.
Also add copyright header to GrubConf.py, I think the grub2 support
added last year qualifies as a substantial change.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Keir Fraser [Mon, 1 Feb 2010 14:03:06 +0000 (14:03 +0000)]
hvm, s3: HVM guest RTCs become unsync'ed across host S3.
Signed-off-by: Kamala Narasimhan <kamala.narasimhan@citrix.com>
Keir Fraser [Fri, 29 Jan 2010 08:59:46 +0000 (08:59 +0000)]
tools/gtraceview: fix SIGFPE
If there are 0 or 1 valid records in the xentrace file,
SIGFPE will occur. Fix it.
Signed-off-by: Yu Zhiguo <yuzg@cn.fujitsu.com>
Keir Fraser [Fri, 29 Jan 2010 08:55:27 +0000 (08:55 +0000)]
blktap2: Prefer AIO eventfd support on kernels >= 2.6.22
Mainline kernel support for eventfd(2) in linux aio was added between
2.6.21 and 2.6.22. Libaio after 0.3.107 has the header file, but
presently few systems support it. Neither do we rely on an up-to-date
libc6.
Instead, this patch adds a header which defines custom iocb_common
struct, and works around a potentially missing sys/eventfd.h.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Keir Fraser [Fri, 29 Jan 2010 08:54:51 +0000 (08:54 +0000)]
blktap2: Separate tapdisk raw I/O into different backends.
Hide tapdisk support for different raw I/O interfaces behind a new
struct tio. Libaio continues to dominate the interface, requiring
everyone to dispatch iocb/ioevent structs.
Backends:
- lio: Kernel AIO via libaio.
- rwio: Canonical read/write() mode.
Misc:
- Fixes a bug in tapdisk-vbd which locks up the sync io mode.
- Wants a PERROR macro in blktaplib.h
- Removes dead code in qcow2raw to make it link again.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jake Wires <jake.wires@citrix.com>
Keir Fraser [Fri, 29 Jan 2010 08:54:22 +0000 (08:54 +0000)]
blktap2: Sort out tapdisk AIO init.
Move event callback registration into tapdisk-queue. This should also
obsolete the dummy pollfd pipe in the synchronous I/O case.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Keir Fraser [Fri, 29 Jan 2010 08:53:52 +0000 (08:53 +0000)]
blktap2: Sort out tapdisk IPC init.
Move I/O and event callbacks setup out of tapdisk-server, into
tapdisk-ipc.
Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Keir Fraser [Fri, 29 Jan 2010 07:14:32 +0000 (07:14 +0000)]
libelf: make elf_phdr_is_loadable load read-only segments.
From: Brad Plant <bplant@iinet.net.au>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 29 Jan 2010 07:10:28 +0000 (07:10 +0000)]
pv-on-hvm: Correct the order of the arguments of out*()
The order of the arguments of outl() is wrong.
The correct order is outl(value, port). The wrong order causes a kernel panic.
The same applies to outw().
Signed-off-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com>
Keir Fraser [Fri, 29 Jan 2010 06:50:23 +0000 (06:50 +0000)]
x86 mca: Be more careful for printk in MCE context
MCE may happen in printk context, and will cause deadlock if we try to
call printk again in MCE context.
A new level (mce_critical) is added to mce_verbosity for printk in MCE
context. This level is only for developers who are aware of this issue.
In mce_panic, force console unlock.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Fri, 29 Jan 2010 06:49:42 +0000 (06:49 +0000)]
x86 mca: Add MCE broadcast checking.
Some platforms broadcast MCE to all logical processors, while others
do not. Distinguishing these platforms will be helpful for a
unified MCA handler.
The "mce_fb" option emulates broadcast MCA on non-broadcast
platforms. This is mainly for the MCA software trigger.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Fri, 29 Jan 2010 06:49:13 +0000 (06:49 +0000)]
x86 mca: Fix the vMCE address translation for HVM guest.
Fix address translation when we inject a virtual MCE to HVM guest.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Fri, 29 Jan 2010 06:48:37 +0000 (06:48 +0000)]
x86 mca: Add the missed put_domain in UCR handler function.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Fri, 29 Jan 2010 06:48:00 +0000 (06:48 +0000)]
x86 mca: Don't GP fault when guest writes neither all 0s nor all 1s to MCA CTL MSRs.
a) For Mci_CTL MSR, the guest can write any value to it. When read back,
it will be ANDed with the physical value. Some bits in the physical value
can be 0, either because they are read-only in hardware (like those masked
by AMD's Mci_CTL_MASK), or because Xen didn't enable them.
If the guest writes some bit as 0 while that bit is 1 in the host, we will
not inject MCEs corresponding to that bank into the guest, as we can't
distinguish whether the MCE was caused by the guest-cleared bit.
b) For MCG_CTL MSR, the guest can write any value to it. When read back,
it will be ANDed with the physical value.
If the guest does not write all 1s, then in mca_ctl_conflict() we simply
do not inject any vMCE into the guest if some bit is set in the physical MSR
while it is cleared in the guest's vMCG_CTL MSR.
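The read-back and conflict rules in (a) and (b) come down to two bit operations. The following is only an illustrative Python sketch, not the Xen implementation: the function names and sample values are hypothetical, and only the AND-on-read rule and the "set in the physical MSR but cleared by the guest" rule are taken from the description above.

```python
def read_back(guest_value: int, host_value: int) -> int:
    """Value the guest reads: what it wrote, ANDed with the physical value."""
    return guest_value & host_value


def ctl_conflict(guest_value: int, host_value: int) -> bool:
    """True if some bit is set in the physical MSR but cleared in the guest's copy."""
    return (host_value & ~guest_value) != 0


host = 0b1011    # hypothetical physical value
guest = 0b1110   # hypothetical guest write: bit 0 cleared, bit 2 set

print(bin(read_back(guest, host)))   # only bits set on both sides survive
print(ctl_conflict(guest, host))     # bit 0 is set in the host but cleared by the guest
```

Under these assumptions, a guest that writes all 1s never produces a conflict, which matches the "does not write all 1s" wording above.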
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Fri, 29 Jan 2010 06:47:24 +0000 (06:47 +0000)]
x86 mca: Handle the vMCA bank correctly
Currently the virtual MCE MSR code assumes that all MSRs in the range from
0 to MAX_NR_BANKS are always MCE MSRs, which is not always correct. With this
patch, mce_rdmsr/mce_wrmsr will only handle the vMCE MSR range from 0
to the MCA banks in the host platform.
Please note that some MSRs beyond the current MCA banks in the host
platform are really MCA MSRs, that should be handled by the general MSR
handler.
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Keir Fraser [Fri, 29 Jan 2010 06:45:45 +0000 (06:45 +0000)]
x86: Clean up c/s 20844:
ca0759a08057
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 27 Jan 2010 08:59:47 +0000 (08:59 +0000)]
xend: destroy restored domain when its device doesn't exist
A migrated domain keeps on running even though its disk doesn't
exist. This situation is undesirable.
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Keir Fraser [Tue, 26 Jan 2010 15:54:40 +0000 (15:54 +0000)]
pygrub: improve grub 2 support
* The "default" value can be a quoted string (containing an integer)
so strip the quotes before interpreting.
* The "set" command takes a variable with an arbitrary name so instead
of whitelisting the ones to ignore simply silently accept any set
command with an unknown variable.
* Ignore the echo command.
* Handle the function { ... } syntax. Previously pygrub would error
out with a syntax error on the closing "}" because it thought it was
the closing bracket of a menuentry.
This makes pygrub2 work with the configuration files generated by
Debian Squeeze today.
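The first two points can be sketched as follows. This is a hedged illustration in Python, not the pygrub code itself: `parse_default` and `handle_set` are hypothetical helper names, and the dictionary of "known" variables is an assumption made for the example.

```python
def parse_default(value: str) -> int:
    """Strip optional surrounding quotes, then interpret the value as an integer."""
    return int(value.strip().strip("\"'"))


def handle_set(line: str, known: dict) -> None:
    """Record variables we care about; silently accept any other 'set' command."""
    _, _, assignment = line.partition(" ")
    name, _, value = assignment.partition("=")
    if name in known:
        known[name] = value
    # Unknown variable names are ignored rather than treated as errors.


variables = {"default": None}
handle_set('set default="1"', variables)
handle_set("set gfxmode=640x480", variables)   # arbitrary name, accepted silently
print(parse_default(variables["default"]))
```

Accepting every `set` command avoids maintaining a whitelist that has to grow with each new variable a distribution's generated configuration uses.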
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Keir Fraser [Tue, 26 Jan 2010 15:54:09 +0000 (15:54 +0000)]
x86: Polarity-switch method only effective in non-directed EOI case.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Tue, 26 Jan 2010 15:53:52 +0000 (15:53 +0000)]
x86: reduce EOI stack's size in per-cpu area.
Only dynamic vectors use the EOI stack, so the size
can be safely reduced to NR_DYNAMIC_VECTORS.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Tue, 26 Jan 2010 15:53:01 +0000 (15:53 +0000)]
x86: Directly clear all pending EOIs once MSI info changed
For unmaskable MSI, the deferred EOI policy only aims to
avoid IRQ storms. It should be safe to clear pending
EOIs in advance when guest irq migration occurs, because the next
interrupt's EOI write is still deferred, which also avoids a
storm.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Tue, 26 Jan 2010 15:52:30 +0000 (15:52 +0000)]
x86: Revert Cset 20334:
dcc5d5d954e9
Recording old MSI info doesn't solve all the corner cases
when guest's irq migration occurs.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Tue, 26 Jan 2010 15:51:53 +0000 (15:51 +0000)]
Update Xen version to 4.0.0-rc3-pre
Keir Fraser [Tue, 26 Jan 2010 14:15:05 +0000 (14:15 +0000)]
Added tag 4.0.0-rc2 for changeset
e5e4573bcaba
Keir Fraser [Tue, 26 Jan 2010 14:15:01 +0000 (14:15 +0000)]
Update Xen version to 4.0.0-rc2
Keir Fraser [Tue, 26 Jan 2010 07:51:20 +0000 (07:51 +0000)]
VT-d: add "iommu=workaround_bios_bug" option
Add this option to work around BIOS bugs. Currently it ignores a DRHD
if "all" devices under its scope are not pci discoverable. This
works around a BIOS bug in some platforms to make VT-d work. But note
that this option doesn't guarantee security, because it might ignore
a DRHD.
So there are 3 options which handle BIOS bugs differently:
iommu=1 (default): If a non-existent device is detected under a DRHD's
scope, or an incorrect RMRR setting (base_address > end_address) is found,
disable VT-d completely in Xen with warning messages. This guarantees
security when VT-d is enabled, or just disables VT-d to let Xen work
without VT-d.
iommu=force: it forces VT-d to be enabled in Xen. If VT-d cannot be
enabled, it will crash Xen. This is mainly for users who must have
VT-d.
iommu=workaround_bios_bug: it works around some BIOS bugs to make
VT-d still work. This might be insecure because there might be a
device not protected by any DRHD if the device is re-enabled by
malicious s/w. This is for users who want to use VT-d regardless of
security.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Tue, 26 Jan 2010 07:50:04 +0000 (07:50 +0000)]
tools/xsm: Expose Flask XSM AVC functions to user-space
This patch exposes the flask_access, flask_avc_cachestats,
flask_avc_hashstats, flask_getavc_threshold, flask_setavc_threshold,
and flask_policyvers functions to user-space. A python wrapper was
created for the flask_access function to facilitate policy based
user-space access control decisions. flask.h was renamed to libflask.h
to remove a naming conflict.
Signed-off-by: Machon Gregory <mbgrego@tycho.ncsc.mil>
Keir Fraser [Sat, 23 Jan 2010 08:28:01 +0000 (08:28 +0000)]
libxl: Fix libconfig install directory
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Tested-by: Ian Campbell <ian.campbell@citrix.com>
Keir Fraser [Sat, 23 Jan 2010 08:26:23 +0000 (08:26 +0000)]
pv-on-hvm: Only unplug emulated devices if requested via module parameter.
dev_unplug=[all,][ide-disks,][aux-ide-disks,][nics]
ide-disks: Unplug all emulated IDE disks (but not CD-ROMs)
aux-ide-disks: As above, but doesn't touch primary IDE master
nics: Unplug all emulated NICs
all: ide-disks and nics
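How such a comma-separated parameter expands can be sketched as below. This is a Python illustration only: the real parsing lives in the PV-on-HVM driver code, `parse_dev_unplug` is a hypothetical name, and rejecting unknown tokens with an error is an assumption rather than documented behaviour.

```python
VALID = {"ide-disks", "aux-ide-disks", "nics"}


def parse_dev_unplug(param: str) -> set:
    """Expand a dev_unplug= value into the set of device classes to unplug."""
    selected = set()
    for token in filter(None, param.split(",")):
        if token == "all":
            # Per the option list above, "all" means ide-disks and nics.
            selected |= {"ide-disks", "nics"}
        elif token in VALID:
            selected.add(token)
        else:
            # Assumption: unknown tokens are treated as an error.
            raise ValueError(f"unknown dev_unplug option: {token!r}")
    return selected


print(sorted(parse_dev_unplug("aux-ide-disks,nics")))
```

An empty parameter yields an empty set, which matches the intent that nothing is unplugged unless requested.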
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Sat, 23 Jan 2010 08:23:24 +0000 (08:23 +0000)]
VT-d: improve RMRR validity checking
In order to make Xen more defensive against VT-d related BIOS issues, this
patch ignores a DRHD if all devices under its scope are not pci
discoverable, and regards a DRHD as invalid and then disables the whole of
VT-d if some devices under its scope are not pci discoverable. But if
iommu=force is set, it will enable all DRHDs reported by the BIOS, to
avoid any security vulnerability with malicious s/w re-enabling
"supposed disabled" devices. Please note that we don't know whether the
devices under the "Include_all" DRHD exist or not, because the scope of
the "Include_all" DRHD won't enumerate common pci devices; it only
enumerates I/OxAPIC and HPET devices.
Signed-off-by: Noboru Iwamatsu <n_iwamatsu@jp.fujitsu.com>
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Fri, 22 Jan 2010 13:32:26 +0000 (13:32 +0000)]
Get libconfig tarball from xenbits
Download libconfig.tar.gz from xenbits.org extfiles rather than from
upstream. This insulates us from upstream networking failures and any
upstream changes to the files hosted etc.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Fri, 22 Jan 2010 11:01:18 +0000 (11:01 +0000)]
x86: check if desc->action is NULL when unbinding guest pirq
Before the igb PF driver is unloaded, dom0 doesn't unload the igbvf driver
automatically. When the igb driver is unloaded, it invokes the
PHYSDEVOP_manage_pci_remove hypercall to remove the VFs, and Xen frees
the msi irqs by pci_cleanup_msi() -> ... -> dynamic_irq_cleanup() and
sets desc->action to NULL. The igbvf driver learns that the VF is
disappearing via a hook ndo_stop() in dev_close() and tries to unbind
the pirq, and Xen would crash as desc->action is NULL now.
Signed-off-by: Dexuan Cui <dexuan.cui@intel.com>
Keir Fraser [Fri, 22 Jan 2010 11:00:45 +0000 (11:00 +0000)]
blktap: fix blktapctrl abort
On rebooting an HVM guest, the blktapctrl daemon dies.
gdb shows the following call trace:
(gdb) where
#0 0x00000039d1830155 in raise () from /lib64/libc.so.6
#1 0x00000039d1831bf0 in abort () from /lib64/libc.so.6
#2 0x00000039d186a38b in __libc_message () from /lib64/libc.so.6
#3 0x00000039d1871634 in _int_free () from /lib64/libc.so.6
#4 0x00000039d1874c5c in free () from /lib64/libc.so.6
#5 0x0000003320a01bdd in ueblktap_probe (h=0x6073b0,
w=<value optimized out>, bepath_im=<value optimized out>) at
xenbus.c:270
#6 0x0000003320a020e0 in xs_fire_next_watch (h=0x6073b0) at
xs_api.c:355
#7 0x0000000000401785 in main (argc=<value optimized out>,
argv=<value optimized out>) at blktapctrl.c:907
There is a case that "/local/domain/0/backend/tap/<dom_id>" exists but
"/local/domain/<dom_id>/vm" is not in the xenstore.
Signed-off-by: Kouya Shimura <kouya@jp.fujitsu.com>
Keir Fraser [Fri, 22 Jan 2010 10:59:51 +0000 (10:59 +0000)]
libxc: mmapbatch-v2 adjustments
Just like the kernel, the fallback implementation of
xc_map_foreign_bulk() should clear the error indication array upon
success.
Also, a few allocations were needlessly using calloc() instead of
malloc().
Finally, in xc_domain_save() allocate the error indicator array once
(along with the other arrays) instead of using realloc() (without
error checking) in the loop body.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Fri, 22 Jan 2010 10:59:03 +0000 (10:59 +0000)]
libxc: New hcall_buf_{prep,release} pre-mlock interface
Allow certain performance-critical hypercall wrappers to register data
buffers via a new interface which allows them to be 'bounced' into a
pre-mlock'ed page-sized per-thread data area. This saves the cost of
mlock/munlock on every such hypercall, which can be very expensive on
modern kernels.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 21 Jan 2010 15:13:00 +0000 (15:13 +0000)]
x86: kill msix_flush_writes()
The (only) two callers of it don't need it, as the MSI-X case of
msi_set_mask_bit() already does the necessary readl().
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Thu, 21 Jan 2010 15:12:38 +0000 (15:12 +0000)]
x86: dump full IRQ affinity
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Thu, 21 Jan 2010 15:12:17 +0000 (15:12 +0000)]
x86: add keyhandler to dump MSI state
Equivalent to dumping IO-APIC state; the question is whether this
ought to live on its own key (as done here), or whether it should be
chained to from the 'i' handler.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Thu, 21 Jan 2010 14:40:05 +0000 (14:40 +0000)]
xend: Dis-allow device assignment if PoD is enabled.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Keir Fraser [Thu, 21 Jan 2010 11:27:11 +0000 (11:27 +0000)]
tools: fix sysfs error path
Attached patch fixes sysfs error path.
NetBSD also has a /proc/mounts file but no sysfs.
On Linux you can test this with sysfs not mounted.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Thu, 21 Jan 2010 11:26:26 +0000 (11:26 +0000)]
VT-d: warn on bogus RMRR entry
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Thu, 21 Jan 2010 09:13:46 +0000 (09:13 +0000)]
xentrace: XC_PAGE_SIZE should be used
20827:
fad80160c001 cannot be compiled on ia64:
xentrace.c:647: error: 'PAGE_SIZE' undeclared (first use in this
This patch fixes it.
Signed-off-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com>
Keir Fraser [Thu, 21 Jan 2010 09:12:01 +0000 (09:12 +0000)]
VT-d: improve RMRR validity checking
Currently, Xen rigorously checks the RMRR range and disables VT-d if the
RMRR range is set incorrectly in the BIOS. But we can actually ignore the
RMRR if the devices under its scope are not pci discoverable, because
the RMRR won't be used by non-existent or disabled devices.
This patch ignores the RMRR if the devices under its scope are not pci
discoverable, and only checks the validity of RMRRs that are actually
used. In order to avoid duplicate pci device detection code, this
patch defines a function pci_device_detect for it.
Signed-off-by: Weidong Han <weidong.han@intel.com>
Keir Fraser [Thu, 21 Jan 2010 09:11:06 +0000 (09:11 +0000)]
VT-d: handle return value of deassign_device
deassign_device may fail, so its failure needs to be captured for
appropriate handling. This patch captures the return values of
deassign_device, and prints error messages if it fails.
In addition, this patch also fixes some code style issues.
Signed-off-by: Weidong Han <Weidong.han@intel.com>
Keir Fraser [Thu, 21 Jan 2010 09:03:20 +0000 (09:03 +0000)]
libxc: Unbreak HVM live migration after
0b138a019292.
0b138a019292 was a little too ambitious replacing xc_map_foreign_batch
with xc_map_foreign_pages in xc_domain_restore. With HVM, some of the
mappings are expected to fail (as "XTAB" pages).
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Thu, 21 Jan 2010 09:03:00 +0000 (09:03 +0000)]
xend: Unbreak live migration with tapdisk2 after 20691:
054042ba73b6
vm.image does not exist at this point in the restore process.
I haven't looked at the memory_sharing code. It's likely something
better is needed to make that work across relocation.
Signed-off-by: Brendan Cully <brendan@cs.ubc.ca>
Keir Fraser [Wed, 20 Jan 2010 20:36:19 +0000 (20:36 +0000)]
libxl, hvm: Add support to trigger power or sleep button events
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Keir Fraser [Wed, 20 Jan 2010 20:34:19 +0000 (20:34 +0000)]
hvm: Add ACPI fixed sleep button
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Keir Fraser [Wed, 20 Jan 2010 20:33:35 +0000 (20:33 +0000)]
xentrace: Per-cpu xentrace buffers
In the current xentrace configuration, xentrace buffers are all
allocated in a single contiguous chunk, and then divided among logical
cpus, one buffer per cpu. The size of an allocatable chunk is fairly
limited, in my experience about 128 pages (512KiB). As the number of
logical cores increases, this means a much smaller maximum
trace buffer per cpu; on my dual-socket quad-core nehalem box with
hyperthreading (16 logical cpus), that comes to 8 pages per logical
cpu.
This patch addresses this issue by allocating per-cpu buffers
separately.
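The arithmetic behind those figures can be written out as a small Python sketch. The 4 KiB page size is inferred from "128 pages (512KiB)"; the helper names are hypothetical, and treating the same chunk limit as the per-cpu ceiling after the change is an assumption, not something the patch description states.

```python
PAGE_SIZE_KIB = 4     # inferred: 512 KiB / 128 pages
CHUNK_PAGES = 128     # largest contiguous allocation reported above


def pages_per_cpu_shared(num_cpus: int) -> int:
    """One contiguous chunk divided evenly among all logical cpus."""
    return CHUNK_PAGES // num_cpus


def pages_per_cpu_separate(num_cpus: int) -> int:
    """Each cpu gets its own allocation, so the chunk limit applies per cpu (assumed)."""
    return CHUNK_PAGES


cpus = 16
print(pages_per_cpu_shared(cpus), pages_per_cpu_shared(cpus) * PAGE_SIZE_KIB)
print(pages_per_cpu_separate(cpus), pages_per_cpu_separate(cpus) * PAGE_SIZE_KIB)
```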
Signed-off-by: George Dunlap <dunlapg@umich.edu>
Keir Fraser [Wed, 20 Jan 2010 09:51:38 +0000 (09:51 +0000)]
xend: Fix 20825:
49a2c1069e14
Converting a Python Int, sizeof(long) already returns byte length
rather than bit length so do not divide-by-8.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Wed, 20 Jan 2010 09:33:59 +0000 (09:33 +0000)]
xend: Properly interpret vcpu_avail Long Integer in xc.hvm_build().
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 19 Jan 2010 15:44:54 +0000 (15:44 +0000)]
Enable IOMMU by default.
Can be disabled with 'iommu=0' boot parameter.
Note that iommu_inclusive_mapping is now also enabled by default, to
deal with systems with broken BIOS tables specifying bad RMRRs. Old
behaviour can be specified via 'iommu_inclusive_mapping=0'.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 19 Jan 2010 10:56:59 +0000 (10:56 +0000)]
x86: Clean up TSC_RELIABLE handling after 20705:
a74aca4b9386
Set the feature by default and disable it if we can detect TSC warp,
rather than leaving the feature cleared and setting it if we happen
not to detect TSC warp.
This way round fixes dom0 kernel boot for Masaki Kanno.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 19 Jan 2010 09:40:30 +0000 (09:40 +0000)]
xc_domain_save: allocate pfn_err before use
Due to recent changes related to xc_map_foreign_bulk, xc_domain_save
segfaults because it tries to use pfn_err without allocating it first.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 18 Jan 2010 14:49:00 +0000 (14:49 +0000)]
libxl: fix "xl list" output
This simple patch fixes the "xl list" output and cleans
libxl_list_domain after the recent API changes to list domains and
VMs.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 18 Jan 2010 14:48:18 +0000 (14:48 +0000)]
minios: implement xc_map_foreign_bulk
In order to do so it modifies map_frames_ex and do_map_frames to take
an int *err as parameter and return any error that way.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Mon, 18 Jan 2010 10:37:28 +0000 (10:37 +0000)]
Revert 20746:
042b371d8728 --- Breaks stubdoms.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Mon, 18 Jan 2010 10:35:36 +0000 (10:35 +0000)]
x86 hvm: Pre-allocate per-cpu HVM memory before bringing CPUs online
after boot. Avoids doing the allocations on the CPU itself, while in a
not-fully-online state and with irqs disabled. This way we avoid
assertions about irqs being disabled in e.g., tlb flush logic.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Sun, 17 Jan 2010 18:20:04 +0000 (18:20 +0000)]
xend: Use max_node_id rather than nr_nodes where appropriate.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Sun, 17 Jan 2010 18:07:10 +0000 (18:07 +0000)]
Change default cpufreq governor to ondemand
Back in c/s 18950 the default cpufreq governor was set to userspace
(it had previously been performance). However, since there is no
supplied userspace program or script that will change the frequency
this is at best a no-op. Worse, on some hardware with some BIOS
revisions, this actually sets the CPUs running at their lowest
frequency rather than their highest and there is a corresponding (and
initially puzzling) drop in performance.
This patch changes the default governor to "ondemand" which should
make it the same as the Linux default and will provide power savings
for the majority without needing to write a userspace governor. For
those that want to install their own governor, that is still possible.
Signed-off-by: John Haxby <john.haxby@oracle.com>
Keir Fraser [Sun, 17 Jan 2010 18:05:32 +0000 (18:05 +0000)]
libxenlight: add a list-vm option to xl that only lists VMs' uuid, domid, name
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Sun, 17 Jan 2010 18:05:03 +0000 (18:05 +0000)]
libxenlight: separate logically list_vm and list_domain
Previously list_domain was something between listing VMs and domains.
Provide 2 separate API calls to list domains and list VMs. The list
VMs API filters out utility domains like stubdomains, and domain 0.
Change is_stubdom to properly check the integer and also return a
boolean value.
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Sun, 17 Jan 2010 18:03:00 +0000 (18:03 +0000)]
Keir Fraser [Sun, 17 Jan 2010 18:01:08 +0000 (18:01 +0000)]
xend: NUMA: fix division by zero on unpopulated nodes
nodes without memory will currently be disabled by also moving the
physical cores connected to them to other nodes. This leads to nodes
without CPUs and thus to a division by zero in the node allocation
algorithm. Attached patch fixes this by checking for 0 before the
division. This fixes domain creation on boxes with memory-less nodes.
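The guard can be sketched like this. It is a hypothetical Python illustration, not the xend allocation code: the load metric (VCPUs per physical CPU) and the names `node_load` and `pick_node` are assumptions chosen only to show a zero check placed before a division.

```python
def node_load(vcpus_on_node: int, cpus_in_node: int) -> float:
    """Load of a node; a node with no CPUs is never a candidate."""
    if cpus_in_node == 0:
        # Check for 0 before dividing, as the patch description says.
        return float("inf")
    return vcpus_on_node / cpus_in_node


def pick_node(nodes: dict) -> int:
    """Pick the node id with the lowest load; values are (vcpus, cpus) pairs."""
    return min(nodes, key=lambda n: node_load(*nodes[n]))


# Node 1 has had its cores moved away, so it has no CPUs at all.
print(pick_node({0: (4, 4), 1: (0, 0), 2: (1, 4)}))
```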
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Keir Fraser [Sun, 17 Jan 2010 17:57:44 +0000 (17:57 +0000)]
libxenlight: Add the line number to the config file parsing error message
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Sun, 17 Jan 2010 17:57:11 +0000 (17:57 +0000)]
libxl: add a newline to xl logging
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Keir Fraser [Fri, 15 Jan 2010 08:27:27 +0000 (08:27 +0000)]
x86: A further fix to xen_in_range().
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 14 Jan 2010 14:11:25 +0000 (14:11 +0000)]
Make sure the minimum shadow allocation is never zero.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com>
Keir Fraser [Thu, 14 Jan 2010 14:10:40 +0000 (14:10 +0000)]
libxc: Fix IOCTL_PRIVCMD_MMAPBATCH_V2 fallback check
privcmd_ioctl returns EINVAL if the type is not supported.
This fixes the guest booting issue caused by C/S 20791.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Thu, 14 Jan 2010 11:46:53 +0000 (11:46 +0000)]
x86: Fix and clarify 20803:
50bd4235f486
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 14 Jan 2010 10:14:17 +0000 (10:14 +0000)]
xend: Fix wait-for-stubdom loop to avoid possible infinite loop
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 14 Jan 2010 10:12:58 +0000 (10:12 +0000)]
Linux: Use losetup -f where available.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 14 Jan 2010 10:03:44 +0000 (10:03 +0000)]
x86: Fix xen_in_range() for fragmented percpu data area.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 14 Jan 2010 09:44:08 +0000 (09:44 +0000)]
xend, NUMA: Fix computation of needed nodes
Enumerate the best nodes and add CPU affinity until all VCPUs can be
backed by at least one physical core. This should fix problems with
asymmetric NUMA configurations and cropped number of CPUs in Xen.
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Keir Fraser [Thu, 14 Jan 2010 09:42:40 +0000 (09:42 +0000)]
libxenlight: fix name to domid conversion.
Also greatly simplify the function that iterates over all domains to
find the domid corresponding to a name.
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Thu, 14 Jan 2010 09:42:06 +0000 (09:42 +0000)]
libxenlight: add error in disk_add if phystype is not recognized
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Thu, 14 Jan 2010 09:41:34 +0000 (09:41 +0000)]
libxenlight: add fuse around generic_device_add related to invalid kinds
Prevent a segfault in case the backend or frontend kinds have
not been set to a correct kind value (or not initialized).
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Thu, 14 Jan 2010 09:40:55 +0000 (09:40 +0000)]
libxenlight: initialize enum to 1, to prevent defaulting to the 0
values when the structure is not properly initialized by the client.
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Thu, 14 Jan 2010 09:40:01 +0000 (09:40 +0000)]
libxenlight: add some return values testing in stubdom
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Wed, 13 Jan 2010 08:33:34 +0000 (08:33 +0000)]
x86 hvm: Change default setting of guest CPUID RDTSCP bit
Expose RDTSCP CPUID to guest only when tsc_mode == TSC_MODE_DEFAULT
and host_tsc_is_safe() returns 1.
Signed-off-by: Dongxiao Xu <dongxiao.xu@intel.com>
Keir Fraser [Wed, 13 Jan 2010 08:18:38 +0000 (08:18 +0000)]
x86: fix unmaskable msi assignment issue.
Currently, an unmasked msi irq's EOI write is deferred until the guest
writes EOI, so eoi_vector needs to stay unchanged until the guest writes
EOI. However, irq migration breaks that assumption and changes
eoi_vector when interrupts are generated through a new vector.
The patch removes the dependency on eoi_vector and directly records
the irq info in the EOI stack, and when the guest writes EOI, just does the
physical EOI for the specific irq (recorded in the EOI stack) on the cpus
according to the cpu_eoi_map.
Signed-off-by: Xiantao Zhang <xiantao.zhang@intel.com>
Keir Fraser [Wed, 13 Jan 2010 08:17:00 +0000 (08:17 +0000)]
x86: minor cleanup to arch_memory_op()
There's a function-wide variable rc, so no need to re-declare it in
individual case handling blocks.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Wed, 13 Jan 2010 08:16:37 +0000 (08:16 +0000)]
Update Xen version to 4.0.0-rc2-pre
Keir Fraser [Wed, 13 Jan 2010 08:14:01 +0000 (08:14 +0000)]
x86: add and use XEN_DOMCTL_getpageframeinfo3
To support wider than 28-bit MFNs, add XEN_DOMCTL_getpageframeinfo3
(with the type replacing the passed in MFN rather than getting or-ed
into it) to properly back xc_get_pfn_type_batch().
With xc_get_pfn_type_batch() only used internally to libxc, move its
prototype from xenctrl.h to xc_private.h.
This also fixes a couple of bugs in pre-existing code:
- the failure path for init_mem_info() leaked minfo->pfn_type,
- one error path of the XEN_DOMCTL_getpageframeinfo2 handler used
put_domain() where rcu_unlock_domain() was meant, and
- the XEN_DOMCTL_getpageframeinfo2 handler could call
xsm_getpageframeinfo() with an invalid struct page_info pointer.
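Why OR-ing the type into the MFN word caps the MFN width can be shown with a small Python sketch. The 32-bit word and the shift of 28 are assumptions inferred from the "wider than 28-bit MFNs" wording above; the function names are hypothetical and do not correspond to libxc or hypervisor symbols.

```python
TYPE_SHIFT = 28                      # assumed: type field starts at bit 28 of a 32-bit word
MFN_MASK = (1 << TYPE_SHIFT) - 1


def pack_or_ed(mfn: int, page_type: int) -> int:
    """Older style: OR the type into the top bits of the word holding the MFN."""
    return (mfn | (page_type << TYPE_SHIFT)) & 0xFFFFFFFF


def unpack_or_ed(word: int) -> tuple:
    """Split a packed word back into (mfn, type)."""
    return word & MFN_MASK, word >> TYPE_SHIFT


def replace_with_types(mfns: list, type_of) -> list:
    """Newer style: each slot that held an MFN is overwritten with just its type."""
    return [type_of(mfn) for mfn in mfns]


small_mfn = 0x0ABCDEF     # fits in 28 bits
wide_mfn = 1 << 28        # needs a 29th bit

print(unpack_or_ed(pack_or_ed(small_mfn, 2)) == (small_mfn, 2))   # round-trips
print(unpack_or_ed(pack_or_ed(wide_mfn, 2)) == (wide_mfn, 2))     # MFN and type bits collide
```

With the replacing style the caller already knows which MFN it put in each slot, so nothing is lost when the slot comes back holding only the type.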
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Wed, 13 Jan 2010 08:12:56 +0000 (08:12 +0000)]
libxc: use new (replacement) mmap-batch ioctl
Replace all calls to xc_map_foreign_batch() where the caller doesn't
look at the passed in array to check for errors by calls to
xc_map_foreign_pages(). Replace all remaining calls by such to the
newly introduced xc_map_foreign_bulk().
As a sideband modification (needed while writing the patch to ensure
they're unused) eliminate unused parameters to
uncanonicalize_pagetable() and xc_map_foreign_batch_single(). Also
unmap live_p2m_frame_list earlier in map_and_save_p2m_table(),
reducing the peak amount of virtual address space required.
All supported OSes other than Linux continue to use the old ioctl for
the time being.
Also change libxc's MAJOR to 4.0 to reflect the API change.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Keir Fraser [Tue, 12 Jan 2010 07:17:40 +0000 (07:17 +0000)]
Added tag 4.0.0-rc1 for changeset
67b5ad8ae87e
Keir Fraser [Tue, 12 Jan 2010 07:11:28 +0000 (07:11 +0000)]
Update Xen version for 4.0.0-rc1
Keir Fraser [Tue, 12 Jan 2010 07:06:12 +0000 (07:06 +0000)]
libxenlight: remove ctx dangerously passed to children
apart from ctx->waitpid, it's potentially harmful to call into
logging.
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 12 Jan 2010 07:05:22 +0000 (07:05 +0000)]
libxenlight: remove ctx argument to exec
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 12 Jan 2010 07:04:46 +0000 (07:04 +0000)]
libxenlight: typo in old patch led to waitpid forever instead of
waitpid with WNOHANG
fixes qemu starting problem
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 12 Jan 2010 07:03:54 +0000 (07:03 +0000)]
libxenlight: misc cleanup
wait_for_device_model expects two pointers at the end, not 2 integers.
remove debugging message in libxl_list
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 12 Jan 2010 07:03:14 +0000 (07:03 +0000)]
libxenlight: do not try to set memory target with a number we haven't
verified in set-mem.
Check that the memory string conversion was done properly instead of
sending a request to balloon a domain to 0 memory.
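The idea of validating the conversion before acting on it can be sketched in Python. This is not the xl code: `parse_mem_target` is a hypothetical helper, and rejecting zero and negative values is an assumption made so that a failed or empty conversion can never be mistaken for a request to balloon to 0.

```python
from typing import Optional


def parse_mem_target(text: str) -> Optional[int]:
    """Return the requested memory target, or None if the string is not a valid number."""
    try:
        value = int(text.strip(), 10)
    except ValueError:
        return None
    # Assumption: a target of 0 (or less) is never a legitimate request.
    return value if value > 0 else None


for arg in ("512", "abc", ""):
    target = parse_mem_target(arg)
    if target is None:
        print(f"invalid memory size: {arg!r}")
    else:
        print(f"would set memory target to {target}")
```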
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 12 Jan 2010 07:02:29 +0000 (07:02 +0000)]
libxenlight: test a lot more return values inside the library
and in xl.
Introducing a domain where the xenguest build function has failed leads
to xenstored receiving SIGBUS, since it tries to access some
of the domain's memory, which hasn't been properly allocated. (There
doesn't seem to be a way to make xenstored more robust to this, though,
since xc_map_foreign_range just succeeds.)
Make xl a lot more robust regarding all those possible random errors.
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Tue, 12 Jan 2010 07:01:21 +0000 (07:01 +0000)]
blktap: make memshr optional
Attached patch makes memshr optional for blktap/blktap2.
This fixes the build for platforms where memshr isn't built.
While there, make indentation consistent.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Tue, 12 Jan 2010 06:56:56 +0000 (06:56 +0000)]
xend, pciquirk: fix uninitialized variable
Fixes uninitialized variable when there's no
PERMISSIVE_CONFIG_FILE
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Tue, 12 Jan 2010 06:55:24 +0000 (06:55 +0000)]
tools: build fixes for NetBSD
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Sat, 9 Jan 2010 08:14:44 +0000 (08:14 +0000)]
x86_32: Fix the build.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Fri, 8 Jan 2010 11:48:36 +0000 (11:48 +0000)]
libxenlight: initialize domid to -1 in domain_create
Prevent call sites that don't check return values from trying to
operate on domain 0.
Instead they use domid -1, which is unlikely to exist.
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>
Keir Fraser [Fri, 8 Jan 2010 11:48:02 +0000 (11:48 +0000)]
libxenlight: don't try to delete paths when they don't exist.
Fix segfault in destroy when creation hasn't been done properly.
Signed-off-by: Vincent Hanquez <vincent.hanquez@eu.citrix.com>